An Input Sensitive Online Algorithm for LCS Computation

Author

  • Heikki Hyyrö
Abstract

We consider the classic problem of computing (the length of) the longest common subsequence (LCS) between two strings A and B with lengths m and n, respectively. There are several input-sensitive algorithms for this problem, such as the O(σn + min{Lm, L(n − L)}) algorithms by Rick [15] and Goeman and Clausen [5] and the O(σn + min{σd, Lm}) algorithms by Chin and Poon [4] and Rick [15]. Here L is the length of the LCS, d is the number of dominant matches between A and B, and σ is the alphabet size. These algorithms require O(σn) time preprocessing for both A and B. We propose a new, fairly simple O(σm + min{Lm, L(n − L)}) time algorithm that works in an online manner: it needs to preprocess only A, and it can process B one character at a time, without knowing the whole string B beforehand. The algorithm also adapts well to the linear space scheme of Hirschberg [6] for recovering the LCS, which was not as easy with the above-mentioned algorithms. In addition, our scheme fits well into the context of incremental string comparison [12,10]. The original algorithm of Landau et al. [12] for this problem uses O(σm + Lm) space. By using our scheme instead, the space usage becomes O(σm + min{Lm, L(n − L)}).
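
To make the online setting concrete, the following minimal Python sketch preprocesses only A and then consumes B one character at a time, reporting the current LCS length after each character. It is not the algorithm proposed in the paper: it uses the classic threshold technique in the spirit of Hunt and Szymanski, and it does not achieve the time bound stated in the abstract. The class name `OnlineLCS` and the method name `feed` are hypothetical names chosen for this illustration.

```python
from bisect import bisect_left
from collections import defaultdict


class OnlineLCS:
    """Illustrative online LCS-length computation (threshold technique).

    Assumption: this is a stand-in sketch, not the paper's algorithm.
    Only A is preprocessed; B arrives one character at a time.
    """

    def __init__(self, a):
        # Preprocess A only: for each character, the sorted positions in A.
        self.positions = defaultdict(list)
        for i, ch in enumerate(a):
            self.positions[ch].append(i)
        # thresh[k] = smallest index in A at which a common subsequence of
        # length k + 1 (with the part of B seen so far) can end.
        # The list is always strictly increasing.
        self.thresh = []

    def feed(self, ch):
        """Consume the next character of B; return the current LCS length."""
        # Visit matching positions of A from right to left so that this
        # single character of B extends any subsequence at most once.
        for i in reversed(self.positions.get(ch, ())):
            k = bisect_left(self.thresh, i)
            if k == len(self.thresh):
                self.thresh.append(i)   # a longer common subsequence exists
            else:
                self.thresh[k] = i      # same length, earlier end in A
        return len(self.thresh)


lcs = OnlineLCS("ABCBDAB")
lengths = [lcs.feed(c) for c in "BDCABA"]
print(lengths)  # [1, 2, 2, 3, 4, 4]
```

The final value, 4, is the LCS length of "ABCBDAB" and "BDCABA". Because `feed` never looks ahead in B, the caller can stop at any point and still hold the LCS length of A against the prefix of B read so far.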

Similar resources

Faster STR-IC-LCS Computation via RLE

The constrained LCS problem asks one to find a longest common subsequence of two input strings A and B with some constraints. The STR-IC-LCS problem is a variant of the constrained LCS problem, where the solution must include a given constraint string C as a substring. Given two strings A and B of respective lengths M and N , and a constraint string C of length at most min{M,N}, the best known ...

Fast and Simple Computation of All Longest Common Subsequences

This paper shows that a simple algorithm produces the all-prefixes-LCSs-graph in O(mn) time for two input sequences of size m and n. Given any prefix p of the first input sequence and any prefix q of the second input sequence, all longest common subsequences (LCSs) of p and q can be generated in time proportional to the output size, once the all-prefixes-LCSs-graph has been constructed. The pro...

Online Scheduling of Jobs for D-benevolent instances On Identical Machines

We consider online scheduling of jobs with specific release time on m identical machines. Each job has a weight and a size; the goal is maximizing total weight of completed jobs. At release time of a job it must immediately be scheduled on a machine or it will be rejected. It is also allowed during execution of a job to preempt it; however, it will be lost and only weight of completed jobs contri...

Fast Linear-Space Computations of Longest Common Subsequences

Space-saving techniques in computations of a longest common subsequence (LCS) of two strings are crucial in many applications, notably in molecular sequence comparisons. For about ten years, however, the only linear-space LCS algorithm known required time quadratic in the length of the input, for all inputs. This paper reviews linear-space LCS computations in connection with two classical para...

A Thematic Hierarchy for Efficient Generation from Lexical-conceptual Structure

This paper describes an implemented algorithm for syntactic realization of a target-language sentence from an interlingual representation called Lexical Conceptual Structure (LCS). We provide a mapping between LCS thematic roles and Abstract Meaning Representation (AMR) relations; these relations serve as input to an off-the-shelf generator (Nitrogen). There are two contributions of this work: (...

Publication date: 2009